2019-04-24

Why is R cool?

Animated graphs

Interactive graphs

Crazy stuff…

Reproducible science

Data science

Flow chart

The data pipeline

How to prepare a working directory

  • The data folder contains all input data (and metadata) used in the analysis;
  • The doc folder contains the manuscript;
  • The figs folder contains figures generated by the analysis;
  • The output folder contains any type of intermediate or output files;
  • The R folder contains R scripts with function definitions;
  • The reports folder contains RMarkdown files that document the analysis or report on results.
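
The layout above can also be created from R. This is a minimal sketch, assuming a hypothetical project root called my-project; it is placed in a temporary directory here only so the code is safe to run, and you would replace it with your own location.

```r
# Hypothetical project root; tempdir() is used only so the sketch is safe to run.
project_root <- file.path(tempdir(), "my-project")

# Folder names taken from the list above.
folders <- c("data", "doc", "figs", "output", "R", "reports")

# Create each folder; recursive = TRUE also creates the project root if needed.
for (folder in folders) {
  dir.create(file.path(project_root, folder),
             recursive = TRUE, showWarnings = FALSE)
}

# List the folders that now exist in the project root.
list.dirs(project_root, full.names = FALSE, recursive = FALSE)
```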

File names and type of data files

File paths

# Absolute path ------------------------------
"C:/Users/zephi/Dropbox/R-crash-course/figs/cropped-rstudio.png"

# Relative path ------------------------------
"figs/cropped-rstudio.png"
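
A relative path is always resolved against the current working directory. The sketch below shows how the two kinds of path relate, using only base R functions; the file itself does not need to exist for the code to run.

```r
# Build the relative path from its parts; file.path() inserts "/" between them.
rel_path <- file.path("figs", "cropped-rstudio.png")
rel_path

# The working directory is the starting point for every relative path.
getwd()

# Joining the two gives the absolute path that the relative path refers to.
abs_path <- file.path(getwd(), rel_path)
abs_path
```

Because the relative form does not mention a user name or drive letter, it keeps working when the project folder is moved to another computer.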

The RStudio IDE

Exercise

  • Create the data;
  • Organize the folders;
  • Create an RStudio project;
  • Load the data into R.

Reading and writing data

cheatsheet

# Load data into R from Excel
library(readxl)
df <- read_xlsx("data/fishing_effort.xlsx")
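
Writing works the same way in the other direction. As a hedged sketch, the example below uses base R only (write.csv and read.csv) with a small made-up data frame and a hypothetical file name; it is not the course data.

```r
# Small made-up data frame, for illustration only (not the course data).
effort <- data.frame(
  vessel = c("A", "B", "C"),
  hours  = c(12.5, 8, 20)
)

# Hypothetical output file; tempdir() keeps the sketch safe to run anywhere.
out_file <- file.path(tempdir(), "fishing_effort_copy.csv")

# Write the data frame to a CSV file without row names.
write.csv(effort, out_file, row.names = FALSE)

# Read the file back in and inspect its structure.
effort_back <- read.csv(out_file)
str(effort_back)
```

In a real project the file would go to the output folder (for example "output/fishing_effort_copy.csv") rather than to a temporary directory.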

Scripting

  • Start your analysis from copies of your raw data;
  • Any cleaning, merging, transforming, etc. of data should be done in scripts, not manually;
  • Split your workflow (scripts) into logical thematic units;
  • Eliminate code duplication by using functions;
  • Document your code and data;
  • Keep intermediary outputs separate from raw data.
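
To illustrate the point about code duplication, here is a minimal sketch with a hypothetical helper function, lb_to_kg, and made-up catch data. The formula is written once and reused for every column.

```r
# Hypothetical helper: convert weights from pounds to kilograms.
lb_to_kg <- function(weight_lb) {
  weight_lb * 0.45359237
}

# Made-up catch data, for illustration only.
catch <- data.frame(
  cod_lb     = c(10, 20),
  haddock_lb = c(5, 40)
)

# Reuse the same function for each column instead of copying the formula.
catch$cod_kg     <- lb_to_kg(catch$cod_lb)
catch$haddock_kg <- lb_to_kg(catch$haddock_lb)
catch
```

If the conversion ever needs to change, it changes in one place only. A function like this would live in a script in the R folder.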

R-packages and GitHub

Ask for help

Useful help

  • Google/StackOverflow
  • R community on Twitter
  • GitHub

How to ask for help

  • Google extensively…
  • You will not get an answer if you duplicate questions (geeks know everything)
  • There is nothing more disrespectful than being lazy
  • 9 times out of 10 someone already had the same or similar problem as you
  • Include a “reproducible example” (reprex)
  • You will not get an answer without a reprex.
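
As a rough sketch of what a reprex can look like: a short script that runs in a fresh R session, carries its own small data set, and shows the step that misbehaves. The data and column names below are made up for illustration. The reprex package provides a reprex() function that renders such a snippet together with its output; the sketch itself uses base R only.

```r
# A reprex carries its own data, small enough to paste into a question.
# Made-up example: counts stored as text, one of which is not a number.
df <- data.frame(
  site  = c("a", "a", "b"),
  count = c("3", "5", "unknown")
)

# The step that misbehaves: the conversion gives NA for "unknown".
# (Normally this also raises a warning; it is suppressed here to keep output short.)
df$count_num <- suppressWarnings(as.numeric(df$count))
df

# dput() prints R code that recreates an object, handy for sharing your own data.
dput(head(df, 3))

# Include version information so others can match your setup.
R.version.string
```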

Further reading